Subjects with schizophrenia are impaired at reinforcement-driven reversal learning from as early as their first episode. The neurobiological basis of this deficit is unknown. We obtained behavioral and fMRI data in 24 unmedicated, primarily first episode, schizophrenia patients and 24 age-, IQ- and gender-matched healthy controls during a reversal learning task. We supplemented our fMRI analysis, focusing on learning from prediction errors, with detailed computational modeling to probe task solving strategy including an ability to deploy an internal goal directed model of the task. Patients displayed reduced functional activation in the ventral striatum (VS) elicited by prediction errors. However, modeling task performance revealed that a subgroup did not adjust their behavior according to an accurate internal model of the task structure, and these were also the more severely psychotic patients. In patients who could adapt their behavior, as well as in controls, task solving was best described by cognitive strategies according to a Hidden Markov Model. When we compared patients and controls who acted according to this strategy, patients still displayed a significant reduction in VS activation elicited by informative errors that precede salient changes of behavior (reversals). Thus, our study shows that VS dysfunction in schizophrenia patients during reward-related reversal learning remains a core deficit even when controlling for task solving strategies. This result highlights VS dysfunction is tightly linked to a reward-related reversal learning deficit in early, unmedicated schizophrenia patients.